PKU_ICL at SemEval-2017 Task 10: Keyphrase Extraction with Model Ensemble and External Knowledge

Authors

  • Liang Wang
  • Sujian Li
Abstract

This paper presents a system that participated in SemEval 2017 Task 10 (subtask A and subtask B): Extracting Keyphrases and Relations from Scientific Publications (Augenstein et al., 2017). Our proposed approach utilizes external knowledge, including Wikipedia, the IEEE taxonomy, and pre-trained word embeddings, to enrich the feature representation of candidate keyphrases. An ensemble of unsupervised models, random forests, and linear models is used for candidate keyphrase ranking and keyphrase type classification. Our system achieves 3rd place in subtask A and 4th place in subtask B.

Similar resources

SZTE-NLP at SemEval-2017 Task 10: A High Precision Sequence Model for Keyphrase Extraction Utilizing Sparse Coding for Feature Generation

In this paper we introduce our system participating in the 2017 SemEval shared task on keyphrase extraction from scientific documents. We aimed at the creation of a keyphrase extraction approach which relies on as few external resources as possible. Without applying any hand-crafted external resources, and only utilizing a transformed version of word embeddings trained on Wikipedia, our prop...

WING-NUS at SemEval-2017 Task 10: Keyphrase Identification and Classification as Joint Sequence Labeling

We describe an end-to-end pipeline processing approach for SemEval 2017’s Task 10 to extract keyphrases and their relations from scientific publications. We jointly identify and classify keyphrases by modeling the subtasks as sequential labeling. Our system utilizes standard, surface-level features along with the adjacent word features, and performs conditional decoding on whole text to extract...

NTNU-2 at SemEval-2017 Task 10: Identifying Synonym and Hyponym Relations among Keyphrases in Scientific Documents

This paper presents our relation extraction system for subtask C of SemEval-2017 Task 10: ScienceIE. Assuming that the keyphrases are already annotated in the input data, our work explores a wide range of linguistic features, applies various feature selection techniques, optimizes the hyperparameters and class weights, and experiments with different problem formulations (single classification m...

273. Task 5. Keyphrase Extraction Based on Core Word Identification and Word Expansion

This paper provides a description of the Hong Kong Polytechnic University (PolyU) System that participated in Task #5 of SemEval-2, i.e., the Automatic Keyphrase Extraction from Scientific Articles task. We followed a novel framework to develop our keyphrase extraction system, motivated by differentiating the roles of the words in a keyphrase. We first identified the core words which are de...

KX: A Flexible System for Keyphrase eXtraction

In this paper we present KX, a system for keyphrase extraction developed at FBK-IRST, which exploits basic linguistic annotation combined with simple statistical measures to select a list of weighted keywords from a document. The system is flexible in that it offers to the user the possibility of setting parameters such as frequency thresholds for collocation extraction and indicators for keyph...

Publication date: 2017